Abstract. A random boolean cellular automaton is a network of boolean 
gates where the inputs, the boolean function, and the initial state of each 
gate are chosen randomly. In this article, each gate has two inputs. Let a 
(respectively c) be the probability that the gate is assigned a constant function 
(respectively a non-canalyzing function, i.e., EQUIVALENCE or EXCLUSIVE OR). 
Previous work has shown that when a > c, with probability asymptotic to 1, 
the random automaton exhibits very stable behavior: almost all of the gates 
stabilize, almost all of them are weak, i.e., they can be perturbed without 
affecting the state cycle that is entered, and the state cycle is bounded in size. 
This article gives evidence that the condition a = c is a threshold of chaotic 
behavior: with probability asymptotic to 1, almost all of the gates are still 
stable and weak, but the state cycle size is unbounded. In fact, the average 
state cycle size is superpolynomial in the number of gates. 

1. Introduction 

A topic of current interest in the theory of complex systems is the existence of 
sharp boundaries between highly ordered and chaotic behavior. Evidence for this 
phenomenon has been provided by computer simulations, where some parameter is 
varied. As the parameter passes through a certain critical region, the behavior of 
the system rapidly changes between the two extremes of stability and chaos p] . In 
this article, we examine one of the simplest, yet most intensively studied, models of 
complex systems — the random boolean cellular automaton. We present analytic 
results proving that there is such a threshold for these systems. 

Boolean cellular automata were introduced by Kauffman in J3| . He was interested 
in determining the conditions when complex systems exhibit stable behavior. Three 
ways of measuring the stability are: 

1. The proportion of gates that stabilize, i.e. eventually stop changing. 

2. The proportion of weak gates, i.e., gates that can be perturbed without af- 
fecting the state cycle that is entered. 

3. The size of the state cycle that the system eventually enters. 

The second and third of these measures are finite discrete analogues of criteria that 
are used to characterize chaos in dynamical systems. A small proportion of weak 
gates is similar to sensitivity to initial conditions, and a large state cycle is similar 
to nonperiodicity. 

Computer simulations, beginning with those described in have suggested that 
certain classes of randomly constructed boolean cellular automata possess all three 
forms of stability with high probability. In the basic random model, each gate 
has two inputs, and the inputs, the boolean functions assigned to the gates, and 
the initial state are all chosen with uniform probability distributions. In particular, 
for each gate, each of the 16 boolean functions of two arguments has probability 
1/16 of being assigned to the gate. 

In spite of extensive experimental work on these automata, comparatively little 
has actually been proven about them. The first article containing formal proofs of 
stability in the basic model is by Luczak and Cohen 0. They show that as $n \to \infty$, 
for almost all random boolean cellular automata with n gates, the number of stable 
gates and the number of weak gates are each asymptotic to n. They also give a nontrivial 
upper bound on the state cycle size. In Lynch ||, it was shown that by giving a 
slight bias to the probability of certain of the boolean functions assigned to the gates 
(on the order of $\log\log n / \log n$), for almost all random boolean cellular automata 
with n gates, the state cycle size can be bounded above by $n^{\gamma}$, for some $\gamma$. However, 
the proof failed when the bias was reduced to 0, i.e. for the basic random model. 
This suggested two lines of research. First, a more extensive analysis of random 
boolean cellular automata with nonuniform probabilities of the boolean functions 
might be possible. This could be a step toward understanding more realistic models 
of complex systems. Also, the breakdown of the proof at the uniform distribution 
hinted at a threshold phenomenon. 

Treating all 16 of the two argument boolean functions individually seems to be a 
complex undertaking. A classification of the boolean functions due to Kauffman [[| 
has proven useful. He referred to certain boolean functions as canalyzing. We will 
define this precisely in the next section, but for now it suffices to note that among 
the canalyzing functions are the constant functions; i.e. the function that outputs 
0 regardless of its inputs and its negation that always outputs 1. Further, among 
the two-argument boolean functions, there are only two non-canalyzing functions: 
the equivalence function that outputs 1 if and only if both of its inputs have the 
same value, and its negation the exclusive OR. 

Let a (respectively c) be the probability that the boolean function assigned to a 
gate is constant (respectively non-canalyzing). In Lynch [9] it was shown that when 
a > c, with probability asymptotic to 1, the random boolean cellular automaton is 
very stable in all three senses: almost all of the gates are stable and weak, and the 
state cycle size is bounded. 

In this article, we investigate the case $a = c \neq 0$. This includes the basic model 
as the special case a = c = 1/8. We prove that the first two kinds of stability 
still hold (although the bounds here are not as tight), but the state cycle size is 
unbounded for almost all automata. In fact, the average state cycle size is greater 
than any polynomial in n. Thus, the automaton still appears to be stable when 
viewed locally, i.e. at the level of a typical gate, but large state cycles are a global 
symptom of the beginning of instability. In a future article, we will describe the 
behavior when a < c. At present, it is known that the number of weak gates is 
less than n by a nontrivial factor. 

2. Definitions 

Let n be a natural number. A boolean cellular automaton B with n gates is a 
triple (D,F,x) where D is a directed graph with vertices 1, . . . ,n (referred to as 
gates), $F = (f_1, \ldots, f_n)$ is a sequence of boolean functions, and $x = (x_1, \ldots, x_n) \in \{0,1\}^n$ 
(the set of 0-1 sequences of length n). In this article, each gate will have 
indegree two, and each boolean function will have two arguments. We say that 
gate j is an input to gate i if (j, i) is an edge of D. B is a finite state automaton 
with state set $\{0,1\}^n$ and initial state x. The pair (D, F) defines the transition 
function of B in the following way. For each $i = 1, \ldots, n$ let $j_i < k_i$ be the inputs 
of i. Given $y = (y_1, \ldots, y_n) \in \{0,1\}^n$, 

\[ B(y) = \bigl(f_1(y_{j_1}, y_{k_1}), \ldots, f_n(y_{j_n}, y_{k_n})\bigr). \]

That is, the state of B at time 0 is x, and if its state at time t is $y \in \{0,1\}^n$, then its 
state at time t+1 is B(y). 
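
The transition function can be illustrated with a minimal Python sketch. The three-gate automaton below is a hypothetical example, not taken from the article; gates are indexed from 0 rather than 1, and `step` is an illustrative name.

```python
from typing import Callable, Sequence, Tuple

Gate = Callable[[int, int], int]

def step(inputs: Sequence[Tuple[int, int]],
         functions: Sequence[Gate],
         state: Sequence[int]) -> Tuple[int, ...]:
    """Return B(y): gate i reads its two inputs (j_i, k_i) and applies f_i."""
    return tuple(f(state[j], state[k]) for (j, k), f in zip(inputs, functions))

# Hypothetical 3-gate automaton (0-indexed): OR, EXCLUSIVE OR, AND.
inputs = [(1, 2), (0, 2), (0, 1)]
functions = [lambda u, v: u | v, lambda u, v: u ^ v, lambda u, v: u & v]

# Iterate from the initial state x for a few time steps.
x = (1, 0, 0)
trajectory = [x]
for _ in range(4):
    trajectory.append(step(inputs, functions, trajectory[-1]))
```

All gates are updated synchronously, so each call to `step` uses only the state at the previous time step.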

Our first set of definitions pertains to the aspects of stability that will be studied. 

Definitions 2.1. Let B = (D, F, x) be a boolean cellular automaton. 

1. We put $B^t(x)$ for the state of B at time t, and $f_i^t(x)$ for the value of its ith 
component, or gate, at time t. 

2. Since the number of states is finite, i.e. $2^n$, there exist times $t_0 < t_1$ such 
that $B^{t_0}(x) = B^{t_1}(x)$. Let $t_1$ be the first time at which this occurs. Then 
$B^{t+t_1-t_0}(x) = B^t(x)$ for all $t \ge t_0$. We refer to the set of states $\{B^t(x) : t \ge t_0\}$ as 
the state cycle of B, to distinguish it from a cycle of D in the graph-theoretic 
sense. 

3. Gate i stabilizes in t steps if for all $t' \ge t$, $f_i^{t'}(x) = f_i^t(x)$. 

4. Gate i is weak if, letting $x^i$ be identical to x except that its ith component is 
$1 - x_i$, 

\[ \exists t_0\, \exists d\, \forall t\, \bigl(t \ge t_0 \Rightarrow B^t(x) = B^{t+d}(x^i)\bigr). \]

That is, changing the state of i does not affect the state cycle that is entered. 
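
As a rough illustration of items 2 and 4, the sketch below finds the state cycle by iterating until a state repeats, and tests weakness by comparing the cycles entered from x and from x with one component flipped. The function names and the three-gate example are assumptions made for illustration; the brute-force search is only practical for small n.

```python
def step(inputs, functions, state):
    """One synchronous update B(y) of all gates."""
    return tuple(f(state[j], state[k]) for (j, k), f in zip(inputs, functions))

def state_cycle(inputs, functions, x):
    """Return (t0, cycle): the first time the cycle is entered and its states in order."""
    seen = {}
    history = []
    y = tuple(x)
    while y not in seen:
        seen[y] = len(history)
        history.append(y)
        y = step(inputs, functions, y)
    t0 = seen[y]
    return t0, history[t0:]

def is_weak(inputs, functions, x, i):
    """Gate i is weak if flipping its initial value leads to the same state cycle."""
    flipped = list(x)
    flipped[i] = 1 - flipped[i]
    _, cycle = state_cycle(inputs, functions, x)
    _, cycle_flipped = state_cycle(inputs, functions, flipped)
    return set(cycle) == set(cycle_flipped)

# Hypothetical 3-gate automaton (0-indexed): OR, EXCLUSIVE OR, AND.
inputs = [(1, 2), (0, 2), (0, 1)]
functions = [lambda u, v: u | v, lambda u, v: u ^ v, lambda u, v: u & v]
x = (1, 0, 0)
```

Comparing the two cycles as sets suffices because two trajectories on the same cycle differ only by a phase shift, which the offset d in the definition absorbs.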

The next definitions describe a property of boolean functions that plays a key 
role in the characterization of the threshold between order and chaos. 

Definitions 2.2. Let $f(x_1, x_2)$ be a boolean function of two arguments. 

1. We say that f depends on argument $x_1$ if for some $v \in \{0,1\}$, $f(0,v) \neq f(1,v)$. 
A symmetric definition applies when f depends on $x_2$. Similarly, if (D, F, x) 
is a boolean cellular automaton, $f_i = f$, and the inputs of gate i are $j_{i1}$ and 
$j_{i2}$, then for m = 1, 2, i depends on $j_{im}$ if f depends on $x_m$. 

2. The function f is said to be canalyzing if there is some m = 1 or 2 and 
some values $u, v \in \{0,1\}$ such that for all $x_1, x_2 \in \{0,1\}$, if $x_m = u$ then 
$f(x_1, x_2) = v$. Argument $x_m$ of f is said to be a forcing argument with 
forcing value u and forced value v. Likewise, if (D, F, x) is a boolean cellular 
automaton and $f_i$ is a canalyzing function with forcing argument $x_m$, forcing 
value u and forced value v, then input $j_{im}$ is a forcing input of gate i. That 
is, if the value of $j_{im}$ is u at time t, then the value of i is guaranteed to be v 
at time t+1. 

All of these definitions generalize immediately to boolean functions of arbitrarily 
many arguments. In the case of two-argument boolean functions, the only 
non-canalyzing functions are equivalence and exclusive OR. The two constant 
functions f(x,y) = 0 and f(x,y) = 1 are trivially canalyzing, as are the four 
functions that depend on only one argument: 

$f(x,y) = x$, 

$f(x,y) = \neg x$, 

$f(x,y) = y$, and 

$f(x,y) = \neg y$. 

The remaining eight boolean functions of two arguments are canalyzing, and they 
are all similar in the sense that both arguments are forcing with a single value, and 
there is one forced value. A typical example is the OR function. Both arguments 
are forcing with 1, and the forced value is 1. 
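
The counts above (two constant functions, four one-argument functions, eight canalyzing functions that depend on both arguments, and two non-canalyzing functions) can be checked by enumerating all 16 truth tables. The following is a small sketch; the class labels and helper names are chosen here for illustration.

```python
from itertools import product

POINTS = list(product((0, 1), repeat=2))  # (x1, x2) in lexicographic order

def truth_tables():
    """Yield all 16 two-argument boolean functions as dicts (x1, x2) -> value."""
    for values in product((0, 1), repeat=4):
        yield dict(zip(POINTS, values))

def depends_on(f, m):
    """True if flipping argument m (0 or 1) changes f for some value of the other argument."""
    for v in (0, 1):
        lo = (0, v) if m == 0 else (v, 0)
        hi = (1, v) if m == 0 else (v, 1)
        if f[lo] != f[hi]:
            return True
    return False

def is_canalyzing(f):
    """True if fixing some argument m to some value u determines the output."""
    for m in (0, 1):
        for u in (0, 1):
            if len({f[p] for p in POINTS if p[m] == u}) == 1:
                return True
    return False

def classify(f):
    if not is_canalyzing(f):
        return "non-canalyzing"
    n_dep = sum(depends_on(f, m) for m in (0, 1))
    return ["constant", "one-argument", "two-argument canalyzing"][n_dep]

counts = {}
for f in truth_tables():
    label = classify(f)
    counts[label] = counts.get(label, 0) + 1
```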

The notion of forcing, defined next, is a combinatorial condition that is useful in 
characterizing stability. It depends on D and F, but not on x. 

Definition 2.3. Again, (D, F, x) is a boolean cellular automaton. Using induction 
on t, we define what it means for gate i to be forced to a value v in t steps. 

If $f_i$ is the constant function $f(x_1, x_2) = v$, then i is forced to v in t steps for all 
$t \ge 0$. 

If the inputs $j_{i1}$ and $j_{i2}$ of i are forced to $u_1$ and $u_2$ respectively in t steps, then 
i is forced to $f_i(u_1, u_2)$ in t + 1 steps. 

If $f_i$ is a canalyzing function with forcing argument $x_m$, forcing value u, and 
forced value v, and $j_{im}$ is forced to u in t steps, then i is forced to v in t + 1 steps. 

By induction on t it can be seen that if i is forced in t steps, then it stabilizes 
for all initial states x in t steps. 
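
Definition 2.3 can be read as a propagation procedure. The sketch below is one possible rendering under an equivalent formulation: a gate is forced at step t+1 exactly when its output is the same for every combination of input values compatible with what was forced at step t. The four-gate example and all names are hypothetical.

```python
def forced_values(inputs, functions, steps):
    """Return a list whose ith entry is the value gate i is forced to in `steps` steps, or None."""
    def determined(f, u1, u2):
        # Unforced inputs range over both values; forced inputs are fixed.
        c1 = (0, 1) if u1 is None else (u1,)
        c2 = (0, 1) if u2 is None else (u2,)
        outs = {f(a, b) for a in c1 for b in c2}
        return outs.pop() if len(outs) == 1 else None

    # Step 0: only constant functions are forced.
    forced = [determined(f, None, None) for f in functions]
    for _ in range(steps):
        forced = [determined(f, forced[j], forced[k])
                  for (j, k), f in zip(inputs, functions)]
    return forced

# Hypothetical 4-gate example (0-indexed):
# gate 0 is constant 1, gate 1 is OR, gates 2 and 3 are EXCLUSIVE OR.
inputs = [(1, 2), (0, 2), (0, 1), (2, 3)]
functions = [lambda u, v: 1,
             lambda u, v: u | v,
             lambda u, v: u ^ v,
             lambda u, v: u ^ v]
```

Here gate 1 is forced through its forcing input (gate 0), gate 2 is forced once both of its inputs are, and gate 3 is never forced because it reads its own unforced value through a non-canalyzing function.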

The following combinatorial notions will be used in characterizing forcing structures. 
We assume the reader is familiar with the basic concepts of graph theory 
(see e.g. Harary). Unless otherwise stated, path and cycle shall mean directed 
path and cycle in the digraph D. 

Definitions 2.4. 1. For any gate i in D with inputs $j_{i1}$ and $j_{i2}$, let 

\[ S_0^-(i) = \{i\} \quad \text{and} \quad S_{d+1}^-(i) = S_d^-(j_{i1}) \cup S_d^-(j_{i2}). \]

2. Then 

\[ N_d^-(i) = \bigcup_{c \le d} S_c^-(i). \]

That is, $N_d^-(i)$ is the set of all gates that are connected to i by a path of 
length at most d. 

3. If I is a set of gates, then $N_d^-(I) = \bigcup_{i \in I} N_d^-(i)$. 

4. In a similar way we define $S_d^+(i)$ and $N_d^+(i)$, the set of all gates reachable 
from i by a path of length at most d. 

Note that whether i is forced in d steps is completely determined by the restriction 
of D and F to $N_d^-(i)$. 

We will examine the asymptotic behavior of random boolean cellular automata. 
For each boolean function f of two arguments, we associate a probability $a_f \in [0,1]$, 
where $\sum_f a_f = 1$. The random boolean cellular automaton with n gates is the result 
of three random processes. First, a random digraph where every gate has indegree 
two is generated. Independently for each gate, its two inputs are selected from the 
$\binom{n}{2}$ equally likely possibilities. Next, each gate is independently assigned a boolean 
function of two arguments, using the probability distribution $(a_f : f\colon \{0,1\}^2 \to \{0,1\})$. 
Lastly, the initial state x is chosen using the uniform distribution on $\{0,1\}^n$. 
We will use B = (D, F, x) to denote a random boolean cellular automaton generated 
as above. For any properties $\mathcal{P}$ and $\mathcal{Q}$ pertaining to boolean cellular automata, we 
put $\mathrm{pr}(\mathcal{P}, n)$ for the probability that the random boolean cellular automaton on n 
gates has property $\mathcal{P}$ and $\mathrm{pr}(\mathcal{P} \mid \mathcal{Q}, n)$ for the conditional probability that $\mathcal{P}$ holds, 
given that $\mathcal{Q}$ holds. Usually, we will omit the n in these expressions since it will be 
understood. Some of the properties we will investigate depend only on D and F. In 
that case, the expression describing $\mathcal{P}$ will involve (D, F) instead of B, and pr can 
be regarded as the probability measure on random (D, F). Similar notation will be 
used for properties that depend only on D. Random variables will be denoted by 
boldface capital letters, and E(X) will be the expectation of X. 
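
The three random processes can be sketched in Python as follows. This is only an illustration under stated assumptions: boolean functions are represented by their truth tables, the `weights` argument plays the role of the distribution $(a_f)$ and defaults to the uniform basic model, and the names `random_automaton`, `TABLES`, and `step` are introduced here.

```python
import random
from itertools import product

# All 16 truth tables, each listed as (f(0,0), f(0,1), f(1,0), f(1,1)).
TABLES = list(product((0, 1), repeat=4))

def random_automaton(n, weights=None, seed=None):
    """Sample (D, F, x): inputs of each gate, a truth table per gate, an initial state."""
    rng = random.Random(seed)
    if weights is None:
        weights = [1.0] * len(TABLES)  # basic model: each function has probability 1/16
    # Each gate independently picks an unordered pair of distinct gates, uniformly.
    inputs = [tuple(sorted(rng.sample(range(n), 2))) for _ in range(n)]
    # Each gate is independently assigned a function according to the weights.
    functions = rng.choices(TABLES, weights=weights, k=n)
    # The initial state is uniform on {0,1}^n.
    x = tuple(rng.randrange(2) for _ in range(n))
    return inputs, functions, x

def step(inputs, functions, state):
    """One synchronous update, looking up f(y_j, y_k) in the gate's truth table."""
    return tuple(f[2 * state[j] + state[k]] for (j, k), f in zip(inputs, functions))
```

A nonuniform distribution, such as one with a = c, can be supplied through `weights` in the same order as `TABLES`.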
We classify the two argument boolean functions as follows: 

1. A contains the two constant functions. 

2. $B_1$ contains the four canalyzing functions that depend on one argument. 

3. $B_2$ contains the eight canalyzing functions that depend on both arguments. 

4. C contains the two non-canalyzing functions. 

Then the probabilities that a gate is assigned a function in each of the categories 
are: 

\[ a = \sum_{f \in A} a_f, \qquad b_1 = \sum_{f \in B_1} a_f, \qquad b_2 = \sum_{f \in B_2} a_f, \qquad c = \sum_{f \in C} a_f. \]

Lastly we put $B = B_1 \cup B_2$ and $b = b_1 + b_2$, the probability that a gate is assigned 
a nonconstant canalyzing function. Throughout the rest of the article, we assume 
the following symmetry conditions on our distributions: 

\[ a_{f(x,y)} = a_{f(y,x)} \quad \text{for all } f \in B_1, \]
\[ a_{f(x,y)} = a_{f(\neg x, \neg y)} \quad \text{for all } f \in B_2. \]

Also, log shall always mean $\log_2$, and ln is the natural logarithm. 

3. Local Stability 

A key idea, first stated in Q, is that almost all of the gates have sufficiently 
large neighborhoods that are trees. We will use the following version of this fact. 

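Lemma 3.1. For any positive $\alpha$ and unbounded increasing function $\omega(n)$, 

\[ \lim_{n \to \infty} \mathrm{pr}\bigl(D \text{ has at most } \omega(n) (\log n)^3 n^{2\alpha} \text{ gates } i \text{ such that } N^-_{\alpha \log n}(i) \text{ is not a tree}\bigr) = 1. \]

The same is true for $N^+_{\alpha \log n}$. 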

Proof. For each gate i, let $X_i$ be the indicator random variable that is 1 if and only 
if $N^-_{\alpha \log n}(i)$ is not a tree, and let $X = \sum_{i=1}^{n} X_i$. If $X_i = 1$, then there exists a 
path P of length $p \le \alpha \log n$ beginning at some gate k and ending at i and another 
path Q of length q, $1 \le q \le \alpha \log n$, beginning at k, disjoint from P except at k 
and its other endpoint, which must be in P. There are no more than $n^p$ ways of 
choosing P and no more than $n^{q-1} \times p$ ways of choosing Q. The probability of any 
such choice is bounded above by $(2/n)^{p+q}$. Therefore 

\[ E(X_i) \le \sum_{p=0}^{\alpha \log n} \sum_{q=1}^{\alpha \log n} \frac{2^{p+q}\, p}{n} \le (\alpha \log n)^3 n^{2\alpha - 1}. \]

Then $E(X) \le (\alpha \log n)^3 n^{2\alpha}$, and the Lemma follows by Markov's inequality. A 
similar argument applies to $N^+_{\alpha \log n}$. □ 

Another result we will need, from || , is a recurrence relation for the probability 
that a gate is forced, given that its in-neighborhood is treelike. 

Lemma 3.2. For $d \ge 0$ and $v \in \{0,1\}$ let 

\[ p_d(v) = \mathrm{pr}(\text{gate } i \text{ is forced to } v \text{ in } d \text{ steps} \mid N_d^-(i) \text{ is a tree}) \quad \text{and} \quad p_d = p_d(0) + p_d(1). \]

Then 

\[ p_d(0) = p_d(1) \]

and $p_d$ satisfies the following recurrence. 

\[ p_0 = a \quad \text{and} \quad p_{d+1} = a + b p_d + c p_d^2. \tag{3.3} \]

The fixed points of the recursion (3.3) are a/c and 1. Consequently, when $a \ge c$, 
$p_d$ converges to 1. We will prove this for a = c, but Figure 1 gives a graphical 
explanation of this fact. Part (a) illustrates a typical case when a > c. In this case, 
as proven in Q , the convergence is geometric. The convergence when a = c, shown 
in Part (b), is not as rapid, but is still sufficiently fast. 
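
The behavior of the recurrence (3.3) is easy to explore numerically. The sketch below iterates it from $p_0 = a$ with b = 1 - a - c, using the two parameter choices shown in Figure 1; the function names are illustrative only.

```python
def iterate_p(a, c, steps):
    """Return [p_0, ..., p_steps] for p_{d+1} = a + b*p_d + c*p_d**2 with b = 1 - a - c."""
    b = 1.0 - a - c
    p = a
    values = [p]
    for _ in range(steps):
        p = a + b * p + c * p * p
        values.append(p)
    return values

def fixed_point_residual(a, c, p):
    """Zero exactly when p is a fixed point of the recurrence."""
    b = 1.0 - a - c
    return a + b * p + c * p * p - p

fast = iterate_p(0.5, 0.25, 60)   # a > c: geometric convergence to 1
slow = iterate_p(0.25, 0.25, 60)  # a = c: slower convergence to 1
```

The residual vanishes at p = 1 and at p = a/c because $c p^2 - (a + c) p + a = (cp - a)(p - 1)$ when a + b + c = 1.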

Lemma 3.4. Let d be a natural number. Then 

\[ p_d \ge 1 - \frac{1}{ad}. \]

Proof. Let $q_d = 1 - p_d$. Then from (3.3), the recurrence for $q_d$ is 

\[ q_{d+1} = q_d - a q_d^2. \tag{3.5} \]

Letting $r_d = 1/q_d$ and using induction on d, we will finish the proof by showing 
that $r_d \ge ad$. When d = 0, this is evident. By (3.5), 

\[ \frac{1}{r_{d+1}} = \frac{1 - a/r_d}{r_d} \]

and so 

\[ r_{d+1} = \frac{r_d}{1 - a/r_d} \ge r_d + a, \]

which establishes the induction step. □ 
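
The bound of Lemma 3.4 can be checked numerically for particular values of a = c. This is only a sanity check of the inequality $q_d \le 1/(ad)$ along the recurrence (3.5), not a substitute for the proof; the function names are illustrative.

```python
def q_sequence(a, steps):
    """Return [q_0, ..., q_steps] for q_{d+1} = q_d - a*q_d**2 with q_0 = 1 - a."""
    q = 1.0 - a
    out = [q]
    for _ in range(steps):
        q = q - a * q * q
        out.append(q)
    return out

def bound_holds(a, steps):
    """Check q_d <= 1/(a*d) for d = 1, ..., steps."""
    qs = q_sequence(a, steps)
    return all(q <= 1.0 / (a * d) for d, q in enumerate(qs) if d >= 1)
```

For the basic model, a = c = 1/8, the check can be run as `bound_holds(0.125, 1000)`.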

Our two main results on local stability are essentially generalizations of similar 
results in fj]. Theorem 3.8 also improves the lower bound on the number of weak 
gates that was given in M7 

FIGURE 1. Examples of the convergence of $p_d$. The dotted line indicates the 
successive iterations of (3.3) from $p_0 = a$ towards 1. 

(a) a = 1/2, c = 1/4. 

(b) a = c = 1/4. 

Theorem 3.6. Let $\alpha < 1/2$ and $\omega(n)$ be any unbounded increasing function. Then 

\[ \lim_{n \to \infty} \mathrm{pr}\bigl((D,F) \text{ has at least } n(1 - \omega(n)/\log n) \text{ gates that are forced in } \alpha \log n \text{ steps}\bigr) = 1. \]

Proof. Let Y be the random variable that counts the number of gates i in (D, F) 
such that $N^-_{\alpha \log n}(i)$ is a tree and i is not forced in $\alpha \log n$ steps. By Lemma 3.4 

\[ E(Y) \le \frac{n}{a \alpha \log n}. \]

By Markov's inequality 

\[ \mathrm{pr}\left(Y \ge \frac{n\,\omega(n)}{a \alpha \log n}\right) \le \frac{1}{\omega(n)} \to 0. \]

Therefore, together with Lemma 3.1, with probability asymptotic to 1, there are at 
most 

\[ \omega(n) (\log n)^3 n^{2\alpha} + \frac{n\,\omega(n)}{a \alpha \log n} = O\!\left(\frac{n\,\omega(n)}{\log n}\right) \]

gates not forced in $\alpha \log n$ steps. □ 
Recalling that the notion of forcing is stronger than stability, we have 

Corollary 3.7. Let $\alpha < 1/2$ and $\omega(n)$ be any unbounded increasing function. Then 

\[ \lim_{n \to \infty} \mathrm{pr}\bigl((D,F) \text{ has at least } n(1 - \omega(n)/\log n) \text{ gates that stabilize in } \alpha \log n \text{ steps}\bigr) = 1. \]

Theorem 3.8. Let $\omega(n)$ be any unbounded increasing function. Then 

\[ \lim_{n \to \infty} \mathrm{pr}\bigl(B \text{ has at least } n(1 - \omega(n)/\log n) \text{ weak gates}\bigr) = 1. \]

Proof. We will use the following fact from j(| . 
Fact . For any gate i and natural number r, 

pr(|S+(»)|=r) = ^(I + OQ). 
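Thus, for $r > \log n$, 

\[ \mathrm{pr}(|S^+(i)| = r) = O\!\left(\frac{2^r}{r!}\right) = O\bigl(2^{-r \log r / 2}\bigr) = o(n^{-2}) \]

and the probability that there exists some gate with $|S^+(i)| > \log n$ is asymptotic 
to 0. For $r \le \log n$, 

\[ \mathrm{pr}(|S^+(i)| = r) = \frac{2^r}{r!} e^{-2} + o(n^{-1/2}). \]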



By Lemma 3.1, this remains true even when the probability is conditioned on 
$N^+_{\alpha \log n}(i)$ being a tree. 

For any gate i and natural number $d \le \alpha \log n$, assuming $N^+_{\alpha \log n}(i)$ is a tree, let 
$\phi_d$ be the probability that there is some gate $j \in N_d^+(i)$ whose value is affected at 
step d, if the value of i is changed at step 0. That is, taking $x^i$ as in Definitions 
2.1 (4), $f_j^d(x^i) \neq f_j^d(x)$. We will show by induction on $d = 1, \ldots, \alpha \log n$ that 
$\phi_d \le 4/d$. Clearly $\phi_1 \le 1$. Assuming $N^+_{\alpha \log n}(i)$ is a tree, let $j \in S^+(i)$ and p be the 
probability that a change to i affects j in step 1. Since $N^+_{d+1}(i)$ is a tree, for any 
$k \in N_d^+(j)$, a change to i affects k in step d+1 if and only if a change to i affects j 
in step 1 and a change to j affects k in step d. Therefore, assuming $|S^+(i)| \le \log n$, 

\[ \phi_{d+1} = 1 - \sum_{r=0}^{\lfloor \log n \rfloor} \mathrm{pr}(|S^+(i)| = r) \times [(1 - p) + p(1 - \phi_d)]^r. \tag{3.9} \]

We show that p = 1/2. The three possibilities to consider are that $f_j \in B_1$, 
$f_j \in B_2$, and $f_j \in C$. Let k be the other input of j. Assuming $f_j \in B_1$, two out of 
the four functions in $B_1$ result in i affecting j in step 1. That is, if i < k they are 
$f(x,y) = x$ and $f(x,y) = \neg x$, and similarly for k < i. Altogether, the probability 
of the first case is $b_1/2$ by the symmetry property $a_{f(x,y)} = a_{f(y,x)}$. Now suppose 
$f_j \in B_2$, and say i < k and $x_k = 0$. (The cases when k < i or $x_k = 1$ are similar.) 
Then $f_j(0,0) \neq f_j(1,0)$. But $f_j$ is canalyzing on both inputs, so $f_j(0,1) = f_j(1,1)$. 
Four out of the eight functions in $B_2$ satisfy these conditions, and the sum of their 
probabilities is $b_2/2$ by the symmetry property $a_{f(x,y)} = a_{f(\neg x, \neg y)}$. The probability 
of the third case is c, so altogether p = b/2 + c = 1/2, since a = c and a + b + c = 1. 
Therefore by the Fact and Equation (3.9), 



\[ \phi_{d+1} = 1 - e^{-2} \sum_{r=0}^{\lfloor \log n \rfloor} \frac{(2 - \phi_d)^r}{r!} + o(n^{-1/2} \log n) \]
\[ = 1 - e^{-\phi_d} + o(n^{-1/2} \log n) \]
\[ \le \phi_d - \frac{\phi_d^2}{2} + \frac{\phi_d^3}{6} + o(n^{-1/2} \log n). \]

If $\phi_d < 1/(\log n)^2$, then $\phi_{d+1} \le 4/(\alpha \log n) \le 4/(d+1)$. If $\phi_d \ge 1/(\log n)^2$, then 
$\phi_{d+1} \le \phi_d - \phi_d^2/4$, and using the same argument that was applied to Equation 
(3.5), $\phi_{d+1} \le 4/(d+1)$. 

Now let Y be the random variable that counts the number of gates i in B such 
that $N^+_{\alpha \log n}(i)$ is a tree and i is not weak. Then by what we have just shown, 

\[ E(Y) \le \frac{4n}{\alpha \log n}. \]

The rest of the proof proceeds as in Theorem 3.6. □ 



4. Lower Bounds on Average State Cycle Size 

4.1. Main Results. Let the random variable C denote the size of the state cycle 
of B. 

Theorem 4.1. For any constant $\gamma$ and sufficiently large n, 

\[ E(C) > n^{\gamma}. \]

In the next theorem, $E(C \mid (D,F))$ is the expected state cycle size of a random 
(D, F) averaged over all $x \in \{0,1\}^n$. 

Theorem 4.2. There is a constant $\gamma > 0$ such that 

\[ \lim_{n \to \infty} \mathrm{pr}\bigl(E(C \mid (D,F)) > n^{\gamma}\bigr) = 1. \]

These theorems will follow from a key result (Lemma 4.15) on the probability of 
existence of certain kinds of structures in (D, F). We first define these structures 
and prove some basic facts about them. Let $\alpha$ be a fixed real number such that 
$0 < \alpha < 1/2$. In the following we will put m for $\lceil \alpha \log n \rceil$. 

4.2. Vortices. 

Definition 4.3. Let B = (D, F, x) be a boolean cellular automaton on n gates. A 
vortex of circumference d consists of two disjoint subsets of gates $R = \{r_0, \ldots, r_{d-1}\}$ 
and $S = \{s_0, \ldots, s_{d-1}\}$ satisfying the following conditions for $0 \le i < d$. 

1. $(r_i, r_{i+1 \pmod d}) \in D$. 

2. $(s_i, r_i) \in D$. 

3. $s_i$ is forced in m steps. 

4. The value that $s_i$ is forced to is not a forcing value for $f_{r_i}$. 

We refer to it as a vortex on R, S or simply $R \cup S$ if we do not need to distinguish 
R and S. 

An example is given in Figure 2. 

The essential characteristics of such a vortex are captured by the directed labeled 
graph 

\[ V = (R \cup S,\; D \upharpoonright (R \cup S),\; F \upharpoonright R,\; v_0, \ldots, v_{d-1}) \]

where $v_i$ is the value that $s_i$ is forced to, for $i = 0, \ldots, d-1$. That is, V is simply 
the restriction of (D, F) to $R \cup S$, with the functions labeling the gates in S replaced 
by their forced values. The isomorphism class of V is called a vortex type. 

For any such vortex type $\tau$, and any $V \in \tau$ as above, we put $\lambda(\tau)$ for the size 
of the automorphism group on V and $\pi(\tau)$ for $\prod_{i=0}^{d-1} a_{f_{r_i}}$. Clearly $\lambda(\tau)$ and $\pi(\tau)$ 
do not depend on the choice of $V \in \tau$. The significance of these two quantities is 
that $(2d)!/\lambda(\tau)$ is the number of distinct labelings of the gates in any $V \in \tau$, and 
$\pi(\tau)$ is the conditional probability that two disjoint subsets R and S, each of size 
d, form a vortex of type $\tau$, given that conditions (1)-(3) in Definition 4.3 hold. The 
following two facts will be used later in the combinatorial analysis of vortices. Let 
T be the set of all vortex types of circumference d. 

FIGURE 2. A schematic diagram of a vortex of circumference 8. Shaded circles 
are members of S, and unshaded circles are in R. The enlargement shows a typical 
$(s_i, r_i)$ pair. In this example, $s_i$ is forced to 0 while $f_{r_i} = \vee$ (the OR function). 
Lemma 4.4. There exists $\rho \in (0,1)$ such that 

\[ \sum_{\substack{\tau \in T \\ \lambda(\tau) > 1}} \pi(\tau) \le d \rho^{d/2}. \]

Proof. The only nontrivial automorphisms of $V \in \tau$ are those that take each $r_i$ to 
$r_{i+p \pmod d}$, where $1 \le p < d$. But this implies 

\[ f_{r_i} = f_{r_{i+p \pmod d}} \quad \text{for } i = 0, \ldots, d-1. \tag{4.5} \]

We may assume p is the minimal number satisfying (4.5), and therefore $p \mid d$, so 
$p \le d/2$. Let $\rho = \max\{a_f : f \notin A\}$, q = d/p, and $T_p$ be the set of vortex types 
satisfying (4.5). Then 

\[ \sum_{\tau \in T_p} \pi(\tau) \le \rho^{p(q-1)} = \rho^{d-p} \le \rho^{d/2}. \]

The factor d in the Lemma is a crude upper bound on the number of divisors of 
d. □ 

Lemma 4.6. We have 

\[ 1 - d 2^{-d/2} \le \sum_{\tau \in T} \pi(\tau) \le 1. \]

Proof. For any sequence $v = (v_0, \ldots, v_{d-1}) \in \{0,1\}^d$, let $T_v$ be the set of all vortex 
types in T such that the labeling of S is isomorphic to v. Let U consist of all 
sequences $v \in \{0,1\}^d$ that are not invariant under any nontrivial cyclic permutation, and let 
$T' = T - \bigcup_{v \in U} T_v$. Then, using the same methods as in Lemma 4.4, $|U| \ge 2^d - d 2^{d/2}$. 
Since 

\[ \sum_{\tau \in T} \pi(\tau) = \sum_{v \in U} \sum_{\tau \in T_v} \pi(\tau) + \sum_{\tau \in T'} \pi(\tau), \]

we will be done by showing that for all $v \in \{0,1\}^d$ 

\[ \sum_{\tau \in T_v} \pi(\tau) = 2^{-d}. \tag{4.7} \]

For every $i = 0, \ldots, d-1$, $v_i$ does not force $f_{r_i}$. Therefore one of the following 
possibilities must hold. 

1. $f_{r_i} \in B_1$, and the input on which $r_i$ depends is $r_{i-1 \pmod d}$. 

2. $f_{r_i} \in B_2$, and $v_i$ is not a forcing value for $f_{r_i}$. 

3. $f_{r_i} \in C$. 

Case (1) has probability $b_1/2$ by the symmetry property $a_{f(x,y)} = a_{f(y,x)}$, Case (2) 
has probability $b_2/2$ by the symmetry property $a_{f(x,y)} = a_{f(\neg x, \neg y)}$, and Case (3) 
has probability c. Therefore, given that $s_i$ is labeled with $v_i$, the probability that 
one of the three cases above holds is 1/2, and (4.7) follows. □ 
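
The counting bound on U used in this proof can be verified by brute force for small d. The sketch below counts the 0-1 sequences of length d that are not fixed by any nontrivial cyclic shift; the function names are illustrative, and the enumeration is exponential in d.

```python
from itertools import product

def has_cyclic_symmetry(v):
    """True if some nontrivial cyclic shift maps the sequence v to itself."""
    d = len(v)
    return any(v[s:] + v[:s] == v for s in range(1, d))

def count_asymmetric(d):
    """Number of sequences in {0,1}^d with no nontrivial cyclic symmetry, i.e. |U|."""
    return sum(1 for v in product((0, 1), repeat=d) if not has_cyclic_symmetry(v))

def lower_bound(d):
    """The bound 2^d - d * 2^(d/2) from the proof of Lemma 4.6."""
    return 2 ** d - d * 2 ** (d / 2)
```

For prime d, the only sequences with a cyclic symmetry are the two constant ones, so `count_asymmetric(d)` equals $2^d - 2$.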

The existence of vortices of sufficiently large prime circumference will be used 
to prove the lower bounds on average state cycle size. This is the relevance of the 
next two basic facts. When we refer to the state of a vortex, we simply mean the 
state of B restricted to $R \cup S$. 

Lemma 4.8. A vortex enters its state cycle in at most m steps. Its state cycle is 
completely determined by the initial state of $R \cup N_m^-(S)$. 

Proof. After m steps, for each $i = 0, \ldots, d-1$, $s_i$ is forced to some value v. Since 
v is not a forcing value for $f_{r_i}$, assuming $s_i < r_{i-1 \pmod d}$ (the case when $s_i > r_{i-1 \pmod d}$ 
is symmetric), $f_{r_i}(v, y) = y$ or $\neg y$. Let us use the notation $f_{r_i}(v, y) = g_i(y)$ 
where $g_i(y) = y$ or $\neg y$, depending on which case holds. 

In other words, after m steps, the vortex is equivalent to a cycle of 1-input gates, 
none of which are constants. Let $u = (u_0, \ldots, u_{d-1})$ be the state of these gates 
after m steps. We need only show that u reoccurs. 

Suppose there is an even number of gates $r_i$ such that $g_i(y) = \neg y$. Then after 
m + d steps, the state of each $r_i$ will be $u_i$. If there is an odd number, then the 
state of each $r_i$ after m + 2d steps will again be $u_i$. In either case, the state cycle 
has been reentered in not more than 2d steps. □ 
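
The parity argument in this proof can be checked on a small ring of 1-input gates. In the sketch below, `negate[i]` is 1 when $g_i(y) = \neg y$ and 0 when $g_i(y) = y$; the representation and function names are assumptions made for illustration.

```python
from itertools import product

def ring_step(negate, u):
    """Gate i reads gate i-1 (mod d) and applies g_i (identity or negation)."""
    d = len(u)
    return tuple(u[i - 1] ^ negate[i] for i in range(d))

def ring_after(negate, u, steps):
    """State of the ring after the given number of steps."""
    for _ in range(steps):
        u = ring_step(negate, u)
    return u

def recurs_as_claimed(negate):
    """Check the even/odd claim for every starting state u of the ring."""
    d = len(negate)
    for u in product((0, 1), repeat=d):
        if sum(negate) % 2 == 0:
            ok = ring_after(negate, u, d) == u
        else:
            complement = tuple(1 - b for b in u)
            ok = (ring_after(negate, u, d) == complement
                  and ring_after(negate, u, 2 * d) == u)
        if not ok:
            return False
    return True
```

After d steps each value has passed through every gate exactly once, so it is flipped once per negating gate, which gives the even and odd cases.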



Lemma 4.9. If the circumference of the vortex is prime, then the size of its state 
cycle is 1, 2, d, or 2d. 

Proof. From the proof of Lemma 4.8, we know that the state repeats every 2d steps, 
and thus the state cycle size is a factor of 2d. Since d is prime, the only such 
factors are 1, 2, d, and 2d. □ 
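
Lemma 4.9 can be checked exhaustively for small circumferences by modeling the vortex, after m steps, as a ring of identity and negation gates as in the proof of Lemma 4.8. The sketch below collects every state cycle size that occurs for a given d; the names are illustrative.

```python
from itertools import product

def ring_step(negate, u):
    """Gate i reads gate i-1 (mod d) and applies identity (0) or negation (1)."""
    d = len(u)
    return tuple(u[i - 1] ^ negate[i] for i in range(d))

def period(negate, u):
    """Number of steps until the ring first returns to state u (the map is a bijection)."""
    v = ring_step(negate, u)
    steps = 1
    while v != u:
        v = ring_step(negate, v)
        steps += 1
    return steps

def cycle_sizes(d):
    """All state cycle sizes over every choice of gates and every starting state."""
    return {period(neg, u)
            for neg in product((0, 1), repeat=d)
            for u in product((0, 1), repeat=d)}
```

For a composite circumference such as d = 6, other sizes such as 3 appear, which is why the lemma requires d to be prime.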

To simplify calculations in the remainder of the proofs, we condition all events 
on the following two properties. Let $\beta > \alpha$ be fixed. 

1. There are no distinct vortices on R, S and R', S' respectively of circumference 
less than or equal to $2\beta \log n$ such that 

\[ (R \cup N_m^-(S)) \cap (R' \cup N_m^-(S')) \neq \emptyset. \]

2. For every vortex of circumference less than or equal to $2\beta \log n$ on any R, S, 
for all distinct $s, s' \in S$, 

$N_m^-(s)$ is a tree, 

$N_m^-(s) \cap N_m^-(s') = \emptyset$, and 

$N_m^-(s) \cap R = \emptyset$. 

A boolean cellular automaton satisfying these conditions is said to be simple. By the 
next lemma, this will not affect the asymptotic probabilities that will be computed. 

Lemma 4.10. We have 

\[ \mathrm{pr}((D,F) \text{ is simple}) = 1 - n^{-\Omega(1)}. \]

Proof. One way that a boolean cellular automaton can fail Condition (1) above is if 
there exist distinct vortices on R, S and R', S' such that $R \cap R' \neq \emptyset$. Then there are 
gates $r_i, r_j \in R$ (possibly the same) and a path of gates in R' beginning at $r_i$ and 
ending at $r_j$, disjoint from R except at the endpoints. If the circumference of R is 
d and the length of the path is l, then p = d + l - 2 is the number of gates in R and 
the path. Letting $\kappa$ range over all choices of (d, l, i, j, C) such that $d, l \le 2\beta \log n$, 
$0 \le i, j < d$, and C is a subset of $\{1, \ldots, n\}$ of size p, we put $X_\kappa$ for the indicator 
random variable that is 1 if and only if the gates in C form a cycle R and a path 
as above. Then, with $X = \sum_\kappa X_\kappa$, E(X) is an upper bound on the expected number of pairs 
of vortices such that $R \cap R' \neq \emptyset$. 

Now 

\[ E(X_\kappa) \le p! \times \frac{1}{\binom{n}{2}} \times \left(\frac{n-1}{\binom{n}{2}}\right)^{p-1} \times \left(\frac{1}{2}\right)^{p} = \frac{p!}{(n-1)n^p} \]

because p! is an upper bound on the number of labelings of C, $1/\binom{n}{2}$ is the probability 
that Condition (1) of Definition 4.3 holds for $r_j$, $(n-1)/\binom{n}{2}$ is the probability 
that Condition (1) holds for each of the other gates in C, and 1/2 is the probability that 
Condition (4) holds for a gate, given that Condition (3) holds. There are $O((\log n)^4)$ 
choices for d, l, i, and j, and for each of these choices, there are $\binom{n}{p}$ choices for C. 
Therefore 

\[ E(X) = O((\log n)^4 n^{-1}) = n^{-\Omega(1)}. \]

On the other hand, if $R \cap R' = \emptyset$, but Condition (1) of simplicity is still violated, 
then there exists a gate g and two paths P and P' of lengths $p, p' \le m + 1$ beginning 
at g and disjoint everywhere else, one path ending in R and the other in R'. There 
are at most n ways of choosing g, $(m+2)^2$ ways of choosing p and p', and $n^{p+p'-2} \times (2\beta \log n)^2$ 
ways of choosing the remaining gates in P and P'. The probability of 
such a choice is bounded above by $(2/n)^{p+p'}$. Therefore, by Markov's inequality, 
the probability that P and P' exist is bounded above by 

\[ n^{p+p'-1} \times (m+2)^2 \times (2\beta \log n)^2 \times \left(\frac{2}{n}\right)^{p+p'} = O\bigl((\log n)^4\, 2^{2\alpha \log n}\, n^{-1}\bigr) = n^{-\Omega(1)}. \]

A similar proof enables us to show that Condition (2) holds with probability 
$1 - n^{-\Omega(1)}$. □ 

One final condition on vortices that will be needed is that they should enter a 
large (relative to their circumference) state cycle from many initial states. This is 
formalized by the next definition. 

Definition 4.11. A vortex of circumference d is strong if for at least 1/2 of the 
initial states of B, the size of the state cycle of the vortex is greater than or equal to d. 

Lemma 4.12. If B is simple, then for any vortex V of circumference $d \ge m + 2$ 
where d is prime, the probability that V is strong is greater than or equal to $1/2 - o(1)$. 

Proof. If at least 1/2 of the initial states take V to a state cycle of size $\ge d$, then 
we are done. Otherwise, by Lemma 4.9, at least 1/2 of the initial states take V to a fixed 
point or a 2-cycle. 

Since $d \ge m + 2$, with probability $\ge 1 - o(1)$, V has at least one i such that 
$f_{r_i} \in C$. Without loss of generality, let us assume $f_{r_0} \in C$, and let x be any initial state 
that takes V to a fixed point or 2-cycle. Let V' be the vortex obtained from V 
by changing $f_{r_0}$ to $\neg f_{r_0}$. Using the notation of Lemma 4.8, this has the effect of 
changing $g_0$ to $\neg g_0$. 

Let $w_0$ and $w_1$ be the values of $r_{m+1}$ in V at times m and m + 1 respectively. 
Then, by Condition (2) in the definition of simplicity, $w_0$ and $w_1$ are also the values 
of $r_{m+1}$ in V' at those times. Since V enters a fixed point or 2-cycle from x, the 
sequence of values of $r_{m+1}$ beginning at time m must be $w_0, w_1, w_0, w_1, \ldots$ (possibly 
$w_0 = w_1$). If V' also enters a fixed point or 2-cycle, then the sequence of values of 
$r_{m+1}$ beginning at m is also $w_0, w_1, w_0, w_1, \ldots$. In particular, assuming m is even, 
its state at time 2m + 2 is $w_0$. If m is odd, a similar argument applies. 

For $j = 0, \ldots, m+1$, let $u_j$ (respectively $u'_j$) be the value of $r_j$ in V (respectively 
V') at time m + j + 1. Then by induction on j, $u'_j = \neg u_j$. But then $w_0 = u'_{m+1} = 
\neg u_{m+1} = \neg w_0$, contradiction. Therefore V' must enter a state cycle of size $\ge d$ 
when started in state x. 

To summarize, we have shown that with probability $1 - o(1)$, there is some gate 
in R, say $r_0$, such that $f_{r_0} \in C$, and V is strong when $r_0$ is assigned one of the 
functions in C. By symmetry, the two choices are equally likely, and the Lemma 
follows. □ 



4.3. Combinatorial Lemmas. We now derive lower bounds on the probability 
of existence of sets of vortices of various circumference. Let D n C [(3 log n, 2/3 log n] 
and \D n \ — k(n) for each positive integer n. Our goal is to find an asymptotic 
estimate for the probability that B has strong vortices of circumference d, for all 
d G D n . The approach is based on sieve methods that are extensions of Ch. Jordan's 
formula and Bonferroni's inequality. The monograph of Bollobas |jj contains an 
exposition of these formulas. The extensions that we will use are described in full 
generality in Lynch 

Fixing $n$, put $k = k(n)$ and index the elements of $D_n$ by $d_1, \ldots, d_k$. For each
$i = 1, \ldots, k$ let $B_i$ be an indexed set of all subsets of $\{1, \ldots, n\}$ of size $2d_i$, say
$B_i = \{C_{ij} : 1 \le j \le \binom{n}{2d_i}\}$. For each $C_{ij}$ let $V_{ij}$ be the property "$B$ has a strong
vortex of circumference $d_i$ on $C_{ij}$."

Take any family of sets

$$S = \{S_i : 1 \le i \le k\}$$

such that $S_i \subseteq B_i$. Let

$$E^{-}(S) = \bigwedge_{i=1}^{k} \bigwedge_{C_{ij} \in S_i} V_{ij}.$$

That is, $E^{-}(S)$ is the set of boolean cellular automata on $n$ gates that have strong
vortices on $C_{ij}$ for each $C_{ij} \in S_i$, $i = 1, \ldots, k$. Let $s = (s_i : 1 \le i \le k)$ be a
sequence of positive integers and

$$L(s) = \sum_{S :\, |S_i| = s_i} \mathrm{pr}\bigl(E^{-}(S) \mid B \text{ is simple}\bigr).$$

We put $\Sigma(s)$ for $\sum_{i=1}^{k} s_i$ and $(r)^k$ for $(r, \ldots, r)$, the sequence of $k$ $r$'s, for any
real number $r$. We use $s \ge (r)^k$ to mean $s_i \ge r$ for $i = 1, \ldots, k$. The next two
lemmas are applications of the extensions of Ch. Jordan's formula and Bonferroni's
inequality mentioned earlier.
Lemma 4.13. We have

$$\mathrm{pr}\Bigl(\bigwedge_{i=1}^{k} B \text{ has a strong vortex of circumference } d_i \Bigm| B \text{ is simple}\Bigr) = \sum_{s \ge (1)^k} (-1)^{\Sigma(s)-k} L(s).$$

Lemma 4.14. For any $K \ge k$,

$$\sum_{\substack{s \ge (1)^k \\ \Sigma(s) \ge K}} (-1)^{\Sigma(s)-K} L(s) \ge 0.$$
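As an illustration, both statements can be checked by brute force on a small finite probability space. The Python sketch below is not from the article: the uniform outcome space, the randomly generated events, and the helper names (`L`, `sieve`, `tail`, `random_instance`) are assumptions made for the example, and the conditioning on simplicity is dropped. It takes $L(s)$ to be the sum, over all choices of $s_i$ events from class $i$, of the probability that every chosen event occurs, and compares the alternating sum $\sum_{s \ge (1)^k} (-1)^{\Sigma(s)-k} L(s)$ with the directly computed probability that every class has at least one event occurring.

```python
from fractions import Fraction
from itertools import combinations, product
import random


def pr(outcomes, events):
    """Uniform probability that every event in `events` occurs."""
    good = [w for w in outcomes if all(w in e for e in events)]
    return Fraction(len(good), len(outcomes))


def L(outcomes, classes, s):
    """Sum of pr(all chosen events occur) over all ways of choosing
    s[i] events from class i (the role played by L(s) in the text)."""
    per_class = [list(combinations(cls, si)) for cls, si in zip(classes, s)]
    total = Fraction(0)
    for pick in product(*per_class):
        chosen = [e for group in pick for e in group]
        total += pr(outcomes, chosen)
    return total


def all_s(classes):
    """All sequences s with 1 <= s[i] <= number of events in class i."""
    return product(*[range(1, len(cls) + 1) for cls in classes])


def direct(outcomes, classes):
    """Probability that each class has at least one event occurring."""
    good = [w for w in outcomes
            if all(any(w in e for e in cls) for cls in classes)]
    return Fraction(len(good), len(outcomes))


def sieve(outcomes, classes):
    """Alternating sum over s >= (1)^k of (-1)^(Sigma(s) - k) * L(s)."""
    k = len(classes)
    return sum((-1) ** (sum(s) - k) * L(outcomes, classes, s)
               for s in all_s(classes))


def tail(outcomes, classes, K):
    """Alternating tail sum over s with Sigma(s) >= K."""
    return sum((-1) ** (sum(s) - K) * L(outcomes, classes, s)
               for s in all_s(classes) if sum(s) >= K)


def random_instance(seed=0, n_outcomes=12, k=2, events_per_class=3):
    """Hypothetical test instance: k classes of random subsets of a uniform space."""
    rng = random.Random(seed)
    outcomes = list(range(n_outcomes))
    classes = [[frozenset(w for w in outcomes if rng.random() < 0.4)
                for _ in range(events_per_class)]
               for _ in range(k)]
    return outcomes, classes
```

Exact rational arithmetic is used so that the identity can be tested with `==` rather than a floating-point tolerance. The enumeration is exponential in the number of events, so the sketch is only meant for toy instances.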

The main result of this subsection is the next lemma. 



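Lemma 4.15. Let $p_m$ be as given in Lemma 3.4, $k(n) = O(\log n/\log\log n)$, and
$\sigma_i$ be the probability that a vortex of circumference $d_i$ is strong, for $i = 1, \ldots, k(n)$.
Then

$$\mathrm{pr}\Bigl(\bigwedge_{i=1}^{k(n)} B \text{ has a strong vortex of circumference } d_i \wedge B \text{ is simple}\Bigr) = \bigl(1 - n^{-\Omega(1)}\bigr) \prod_{i=1}^{k(n)} \bigl(1 - e^{-p_{d_i}\sigma_i}\bigr) + n^{-\Omega(\log\log n)}.$$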



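Proof. We will show that

$$\mathrm{pr}\Bigl(\bigwedge_{i=1}^{k} B \text{ has a strong vortex of circumference } d_i \Bigm| B \text{ is simple}\Bigr) = \bigl(1 - n^{-\Omega(1)}\bigr) \prod_{i=1}^{k} \bigl(1 - e^{-p_{d_i}\sigma_i}\bigr) + n^{-\Omega(\log\log n)}.$$

The Lemma will follow by Lemma 4.10. For $i = 1, \ldots, k$ let $T_i$ be the set of all
vortex types of circumference $d_i$. Take any $S = \{S_i : 1 \le i \le k\}$ such that each
$S_i \subseteq B_i$, $|S_i| = s_i$, and $C_{gh} \cap C_{ij} = \emptyset$ for all $(g, h) \ne (i, j)$, $C_{gh} \in S_g$, $C_{ij} \in S_i$. By
Lemma 3.2,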



W{E-{S)\B is simple) = J] 



i=l 



(2d,)! / 1 



S A(r) " V(a). 



x I TnT x Hr x H 7 ") x CT 



By Lemmas 4.4 and 4.6, this is 



n 



(l-n" n «) 



n(n — 1) 



x (2d,)! x en 



Then, using the falling factorial power notation $n^{\underline{m}} = \prod_{i=0}^{m-1} (n - i)$,

„S2d iSi fc 



n^i((2d*)0"*' 



n 



(l_ n -n(D)f_P 



n(n — 1) 



x (2dj)l x 
Let us approximate $L(s)$ when $\Sigma(s) \le (\log n)^2$. Since $1 - x = e^{-x - O(x^2)}$ for
$x \to 0$, $(1 - n^{-\Omega(1)})^{\Sigma(s)} = 1 - n^{-\Omega(1)}$. Then, using Stirling's formula,

$$L(s) = \bigl(1 - n^{-\Omega(1)}\bigr) \prod_{i=1}^{k} \frac{(p_{d_i}\sigma_i)^{s_i}}{s_i!}.$$

The number of sequences $s$ such that $s \ge (1)^k$ and $\Sigma(s) = (\log n)^2$ is bounded
above by

$$\binom{(\log n)^2}{k-1} \le (\log n)^{O(\log n/\log\log n)} = n^{O(1)}.$$

For any such $s$, there is some $i$ such that $s_i \ge \log n$. Therefore

$$\sum_{\substack{s \ge (1)^k \\ \Sigma(s) = (\log n)^2}} L(s) \le \frac{n^{O(1)}}{(\log n)!} = n^{-\Omega(\log\log n)}.$$

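By Lemmas 4.13 and 4.14 (taking $K = (\log n)^2$),

$$\begin{aligned}
\mathrm{pr}\Bigl(\bigwedge_{i=1}^{k} B \text{ has a strong vortex of circumference } d_i \Bigm| B \text{ is simple}\Bigr)
&= \bigl(1 - n^{-\Omega(1)}\bigr) \sum_{\substack{s \ge (1)^k \\ \Sigma(s) < (\log n)^2}} (-1)^{\Sigma(s)-k} \prod_{i=1}^{k} \frac{(p_{d_i}\sigma_i)^{s_i}}{s_i!} + n^{-\Omega(\log\log n)} \\
&= \bigl(1 - n^{-\Omega(1)}\bigr) \sum_{(1)^k \le s \le ((\log n)^2)^k} (-1)^{\Sigma(s)-k} \prod_{i=1}^{k} \frac{(p_{d_i}\sigma_i)^{s_i}}{s_i!} + n^{-\Omega(\log\log n)} \\
&= \bigl(1 - n^{-\Omega(1)}\bigr) \prod_{i=1}^{k} \sum_{1 \le s \le (\log n)^2} (-1)^{s-1} \frac{(p_{d_i}\sigma_i)^{s}}{s!} + n^{-\Omega(\log\log n)} \\
&= \bigl(1 - n^{-\Omega(1)}\bigr) \prod_{i=1}^{k} \bigl(1 - e^{-p_{d_i}\sigma_i}\bigr) + n^{-\Omega(\log\log n)}. \qquad \Box
\end{aligned}$$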
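Two numerical facts sit behind the last steps of this proof: the alternating series $\sum_{s \ge 1} (-1)^{s-1} x^s/s!$ equals $1 - e^{-x}$, so truncating it at $(\log n)^2$ terms costs very little, and $1/(\log n)!$ decays like $n^{-\Omega(\log\log n)}$. The Python sketch below illustrates both. The function names are hypothetical, and the use of base-2 logarithms is an assumption about the article's convention.

```python
import math


def truncated_series(x, M):
    """Return sum_{s=1}^{M} (-1)^(s-1) * x^s / s!, accumulated term by term."""
    total, term = 0.0, 1.0
    for s in range(1, M + 1):
        term *= x / s                      # term is now x^s / s!
        total += term if s % 2 == 1 else -term
    return total


def factorial_decay_exponent(n):
    """Return c such that 1 / (log2 n)! = n^(-c).

    lgamma(m + 1) is ln(m!), so c = ln((log2 n)!) / ln(n).
    Base-2 logarithms are an assumption of this sketch.
    """
    m = math.log2(n)
    return math.lgamma(m + 1.0) / math.log(n)


if __name__ == "__main__":
    x = 0.7
    print(truncated_series(x, 30), 1.0 - math.exp(-x))
    for bits in (16, 64, 256):
        print(bits, factorial_decay_exponent(2 ** bits))
```

The exponent returned by `factorial_decay_exponent` grows with $n$, which is the behaviour the $n^{-\Omega(\log\log n)}$ error term encodes.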



Corollary 4.16. If $k(n) = O(\log n/\log\log n)$, then

$$\mathrm{pr}\Bigl(\bigwedge_{i=1}^{k(n)} B \text{ has a vortex of circumference } d_i \wedge B \text{ is simple}\Bigr) = n^{-o(1)}.$$



Proof. By Lemma 3.4 
and by Lemma 4.12, $\sigma_i \ge 1/2 - o(1)$. Therefore

$$\prod_{i=1}^{k(n)} \bigl(1 - e^{-p_{d_i}\sigma_i}\bigr) \ge \bigl(e^{-2\beta/(a\alpha)}/5\bigr)^{O(\log n/\log\log n)} \quad \text{(any constant} > 4 \text{ will do)}$$
$$= n^{-o(1)}. \qquad \Box$$



4.4. Completion of Proofs. 



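Proof of Theorem 4.1. For each $n$ let $D_n$ be the set of primes in $[\beta \log n, 2\beta \log n]$.
By the Prime Number Theorem [10],

$$k(n) \sim \frac{\beta \log n}{\ln \log n}.$$

Therefore by Corollary 4.16,

$$\mathrm{pr}\Bigl(\bigwedge_{i=1}^{k(n)} B \text{ has a strong vortex of circumference } d_i \wedge B \text{ is simple}\Bigr) = n^{-o(1)}.$$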



Take any $B$ satisfying the above condition. Since $B$ is simple, with probability $\ge$
random starting state takes each strong vortex of circumference
$d_i$, $i = 1, \ldots, k(n)$, to a state cycle of size $d_i$ or $2d_i$. That is, for such a starting
state, $B$ enters a state cycle of size greater than or equal to

$$\prod_{i=1}^{k(n)} d_i = n^{\beta \log e - o(1)}.$$

Thus, with probability $\ge n^{-o(1)}$, $B$ enters a state cycle larger than $n^{\beta \log e - o(1)}$. By
Markov's inequality,

$$E(C) \ge n^{\beta}.$$

Since $\beta$ was arbitrarily large, the Theorem follows. □
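The size estimate used in this proof, that the product of the primes in $[\beta \log n, 2\beta \log n]$ is $n^{\beta \log e - o(1)}$, can be illustrated numerically. The Python sketch below is only an illustration: the function names are hypothetical, and it assumes that $\log$ denotes the base-2 logarithm (which is why a factor $\log e$ appears). It sieves the primes in the interval and returns the exponent $c$ for which their product equals $n^{c}$.

```python
import math


def primes_in(lo, hi):
    """Primes p with lo <= p <= hi, found with a sieve of Eratosthenes."""
    hi = int(math.floor(hi))
    if hi < 2:
        return []
    is_prime = bytearray([1]) * (hi + 1)
    is_prime[0:2] = b"\x00\x00"
    for p in range(2, math.isqrt(hi) + 1):
        if is_prime[p]:
            # Clear all multiples of p starting at p*p.
            is_prime[p * p::p] = bytearray(len(range(p * p, hi + 1, p)))
    return [p for p in range(max(2, math.ceil(lo)), hi + 1) if is_prime[p]]


def cycle_exponent(n, beta):
    """Return (c, k) where the k primes in [beta*log2 n, 2*beta*log2 n]
    have product n^c. Base-2 logarithms are an assumption of this sketch."""
    x = beta * math.log2(n)
    ps = primes_in(x, 2.0 * x)
    c = sum(math.log(p) for p in ps) / math.log(n)
    return c, len(ps)


if __name__ == "__main__":
    beta = 3.0
    for bits in (100, 1000, 10000):
        c, k = cycle_exponent(2 ** bits, beta)
        print(bits, k, c, beta * math.log2(math.e))
```

Because the circumferences are distinct primes, cycle lengths $d_i$ or $2d_i$ on disjoint vortices combine to a joint period of at least the product of the $d_i$, which is what the exponent above measures.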



Proof of Theorem 4.2. Take $D_n$ as in the previous proof. Fixing $n$, for $i = 1, \ldots, k(n)$
let $X_i$ be the indicator random variable that is 1 if and only if $B$ has a strong vortex
of circumference $d_i$, and $X = \sum_{i=1}^{k(n)} X_i$. Then, still assuming simplicity, by Lemma
4.15,

$$E(X_i) = \bigl(1 - n^{-\Omega(1)}\bigr)\bigl(1 - e^{-p_{d_i}\sigma_i}\bigr) + n^{-\Omega(\log\log n)}.$$

Since $k(n) = \Theta(\log n/\log\log n)$ and $1 - e^{-p_{d_i}\sigma_i} \ge e^{-2\beta/(a\alpha)}/5$,

$$E(X) \sim \sum_{i=1}^{k(n)} \bigl(1 - e^{-p_{d_i}\sigma_i}\bigr) \to \infty.$$
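Similarly,

$$\begin{aligned}
E(X^2) &= \sum_{i=1}^{k(n)} E(X_i) + 2 \sum_{1 \le i < j \le k(n)} E(X_i X_j) \\
&\sim \sum_{i=1}^{k(n)} \bigl(1 - e^{-p_{d_i}\sigma_i}\bigr) + 2 \sum_{1 \le i < j \le k(n)} \bigl(1 - e^{-p_{d_i}\sigma_i}\bigr)\bigl(1 - e^{-p_{d_j}\sigma_j}\bigr) \\
&\sim (E(X))^2.
\end{aligned}$$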

Therefore by Chebyshev's inequality, for any $\delta < 1$,

$$\mathrm{pr}\bigl(X \le \delta E(X) \mid B \text{ is simple}\bigr) \le \frac{E(X^2) - (E(X))^2}{(1-\delta)^2 (E(X))^2} \to 0.$$



That is, almost all $B$ have at least $\delta k(n) e^{-2\beta/(a\alpha)}/5$ strong vortices of distinct
prime circumferences in $[\beta \log n, 2\beta \log n]$. For all such automata, with probability
$\ge 2^{-\delta k(n) e^{-2\beta/(a\alpha)}/5} = n^{-o(1)}$, the starting state leads to a state cycle larger than
or equal to

$$(\beta \log n)^{\delta k(n) e^{-2\beta/(a\alpha)}/5} \ge e^{\beta\delta e^{-2\beta/(a\alpha)} \log n/5} = n^{\beta\delta e^{-2\beta/(a\alpha)} \log e/5}.$$

By Markov's inequality,

$$E(C \mid (D, F)) \ge n^{\beta\delta e^{-2\beta/(a\alpha)} \log e/5 - o(1)}$$

and we can take any $\gamma < \beta\delta e^{-2\beta/(a\alpha)} \log e/5$. In fact, as noted in Corollary 4.16,
the 5 can be replaced by 4. □
Note that $\beta e^{-2\beta/(a\alpha)}$ has a unique maximum when $\beta = a\alpha/2$. Therefore, since
the only restrictions on $\alpha$, $\beta$, and $\delta$ are that $\alpha < 1/2$, $\alpha < \beta$, and $\delta < 1$, the $\gamma$ in
Theorem 4.2 can be arbitrarily close to $e^{-2/a} \log e/$
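The dependence of the exponent on the parameters can be explored numerically. The Python sketch below is an illustration under stated assumptions rather than part of the proof: it evaluates the bound $\beta\delta e^{-2\beta/(a\alpha)} \log e/5$ from the proof of Theorem 4.2, takes $\log$ to be the base-2 logarithm, and uses hypothetical function names. The grid search checks the location of the unconstrained maximum of $\beta e^{-2\beta/(a\alpha)}$, and the final evaluation plugs in $a = 1/8$ with $\alpha$, $\beta$, and $\delta$ at the limiting values $1/2$, $1/2$, and $1$.

```python
import math


def gamma_bound(a, alpha, beta, delta, divisor=5.0):
    """beta * delta * exp(-2*beta/(a*alpha)) * log2(e) / divisor.

    This is the exponent bound read off the proof; base-2 logarithms and
    the default divisor 5 are assumptions of this sketch.
    """
    return (beta * delta * math.exp(-2.0 * beta / (a * alpha))
            * math.log2(math.e) / divisor)


def grid_argmax(a, alpha, lo=1e-4, hi=1.0, steps=20000):
    """Grid search for the beta in [lo, hi] maximising beta*exp(-2*beta/(a*alpha))."""
    best_beta, best_val = lo, -1.0
    for i in range(steps + 1):
        beta = lo + (hi - lo) * i / steps
        val = beta * math.exp(-2.0 * beta / (a * alpha))
        if val > best_val:
            best_beta, best_val = beta, val
    return best_beta


if __name__ == "__main__":
    print(grid_argmax(0.5, 0.4), 0.5 * 0.4 / 2.0)   # both close to 0.1
    print(gamma_bound(1.0 / 8.0, 0.5, 0.5, 1.0))    # roughly 1.6e-8
```

Since the function is decreasing for $\beta > a\alpha/2$ and the constraints force $\beta > \alpha > a\alpha/2$, the best admissible value is approached as $\beta$ tends to $\alpha$, where the exponential factor becomes $e^{-2/a}$.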



5. Discussion 

As mentioned in the Introduction, there have been many computer simulations 
of random boolean cellular automata, specifically the uniform distribution model 
where a = c = 1/8. The results indicate a rather slow, even sublinear, growth rate
of the average state cycle size as a function of the number of gates. At first glance,
the superpolynomial average size of state cycles given by Theorem 4.1 seems to
contradict the experimental evidence. There are two possible resolutions to this.
First, a = c is the border where large state cycles are just beginning to appear.
This may not be noticeable until the number of gates is quite large. Perhaps the 
simulated automata were not large enough. 
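For readers who want to reproduce the kind of experiment referred to here, the following Python sketch simulates the uniform distribution model on a small number of gates and measures the state cycle reached from a random starting state. It is a minimal sketch, not the code used in the cited simulations: the function names are hypothetical, inputs are drawn independently (so a gate's two inputs may coincide, which is an assumption), and each gate's function is one of the 16 two-input truth tables chosen uniformly, which gives a = c = 1/8 (tables 0 and 15 are constant; tables 6 and 9 are EXCLUSIVE OR and EQUIVALENCE).

```python
import random


def random_automaton(n, rng):
    """Random two-input network: input pairs and 4-bit truth tables."""
    inputs = [(rng.randrange(n), rng.randrange(n)) for _ in range(n)]
    tables = [rng.randrange(16) for _ in range(n)]
    return inputs, tables


def step(state, inputs, tables):
    """Synchronous update: gate g reads bit 2*x + y of its truth table,
    where x and y are the current values of its two inputs."""
    return tuple((tables[g] >> (2 * state[i] + state[j])) & 1
                 for g, (i, j) in enumerate(inputs))


def find_cycle(state, inputs, tables):
    """Iterate until a state repeats; return (cycle length, a state on the cycle)."""
    seen = {}
    t = 0
    while state not in seen:
        seen[state] = t
        state = step(state, inputs, tables)
        t += 1
    return t - seen[state], state


def mean_cycle_length(n, trials, seed=0):
    """Average cycle length over random automata and random starting states."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        inputs, tables = random_automaton(n, rng)
        start = tuple(rng.randrange(2) for _ in range(n))
        length, _ = find_cycle(start, inputs, tables)
        total += length
    return total / trials


if __name__ == "__main__":
    for n in (8, 12, 16):
        print(n, mean_cycle_length(n, trials=200))
```

The dictionary of visited states keeps the sketch simple but limits it to small n. Sample averages of this kind are dominated by typical automata and can easily miss the rare networks with very large state cycles.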

Second, our proof shows that the large average state cycle size is due to a small 
fraction of the automata that have very large state cycles. It may be that most of 
the automata have relatively small state cycles. Our other main result (Theorem
4.2) is consistent with this. It gives an $n^{\gamma}$ lower bound on state cycle size averaged
over all inputs, for almost all networks (D, F). The exponent $\gamma$ is quite small. For
a = c = 1/8, it is less than $2 \times 10^{-8}$. Two relevant open problems are to improve
the lower bound in Theorem 4.2 and the upper bound for state cycle size in
Other computer experiments indicate that systems on the edge of chaos show
complex computational capability. To formalize this notion in terms of the model in 
this article, we should consider random boolean cellular automata with inputs and 
outputs. Then, instead of looking at stability measures, we should try to determine 
the conditions that result in substructures that compute complex functions. If 
the experimental evidence is correct, then the a = c threshold is the region where 
these substructures arise. The techniques used here to prove the existence of large 
vortices may be applicable. 

The model studied in this article is essentially a metaphor for complex biological
systems. Future work in this area will inevitably lead to models with more biological 
detail and accuracy. Whether such models will be mathematically tractable cannot 
be answered now, but there are some simple generalizations of our model that may 
be pertinent to this question. One example is random boolean cellular automata 
where the probabilities of the functions assigned to gates do not necessarily satisfy 
any symmetry conditions. An immediate question is whether the results of and 
this article extend to non-symmetric probabilities. Another generalization is to 
random boolean cellular automata whose gates need not have exactly two inputs. 
One-input gates are just a special type of two-input gates, but the population 
of three-input gates seems quite different because of the large proportion of non- 
canalyzing functions. 

Lastly, two technical problems are to analyze the stability of random boolean 
cellular automata without constant gates, i.e., a = 0, and those where a < c. Results
on the proportion of weak gates indicate that a < c is the chaotic region, but the 
proportion of stable gates and nontrivial bounds on state cycle size are not known. 
We make the following conjectures: 

1. If a < c then asymptotically a/c of the gates are stable. Recall that in this 
case, a/c is the smaller of the fixed points of the recurrence (3.3). 

2. As a - c increases, stability of the system increases. That is, the proportions
of stable and weak gates increase, and the size of the state cycle decreases. 